On the minimum FLOPs problem in the sparse Cholesky factorization

Authors

  • Robert Luce
  • Esmond G. Ng
Abstract

Prior to computing the Cholesky factorization of a sparse, symmetric positive definite matrix, a reordering of the rows and columns is computed so as to reduce both the number of fill elements in the Cholesky factor and the number of arithmetic operations (FLOPs) in the numerical factorization. These two metrics are clearly related, and yet it is suspected that minimizing them are two different problems. However, no rigorous theoretical treatment of the relation between these two problems seems to have been given yet. In this paper we show by means of an explicit, scalable construction that the two problems are different in a very strict sense: in our construction no ordering that is optimal for the fill is optimal with respect to the number of FLOPs, and vice versa. Further, it is commonly believed that minimizing the number of FLOPs is no easier than minimizing the fill (in the complexity sense), but so far no proof appears to be known. We give a reduction chain that shows the NP-hardness of minimizing the number of arithmetic operations in the Cholesky factorization.
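To make the two metrics in the abstract concrete, the following is a minimal sketch (not the paper's construction) that runs symbolic elimination on the graph of a sparse symmetric matrix and reports both the fill and a FLOP count for a given ordering. The function name `symbolic_cholesky` and the FLOP convention are assumptions made for illustration: a column with `eta` off-diagonal nonzeros is charged `eta` divisions plus `eta * (eta + 1) / 2` multiplications; other counting conventions exist and the paper may use a different one.

```python
from itertools import combinations


def symbolic_cholesky(n, edges, order):
    """Return (fill, flops) for eliminating vertices 0..n-1 in `order`.

    `edges` lists the off-diagonal nonzeros of the symmetric matrix.
    The FLOP formula below is an assumed convention, not the paper's.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    eliminated = set()
    fill = 0
    flops = 0
    for v in order:
        # Neighbors not yet eliminated = off-diagonal nonzeros of this column of L.
        nbrs = adj[v] - eliminated
        # Eliminating v makes its remaining neighbors pairwise adjacent.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
        eta = len(nbrs)
        # eta divisions + eta*(eta+1)/2 multiplications (assumed convention).
        flops += eta * (eta + 3) // 2
        eliminated.add(v)
    return fill, flops


# Star graph: center 0 connected to leaves 1..4 (an "arrow" matrix).
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
center_first = symbolic_cholesky(5, star, [0, 1, 2, 3, 4])
leaves_first = symbolic_cholesky(5, star, [1, 2, 3, 4, 0])
print(center_first, leaves_first)
```

On this small star graph the two metrics happen to favor the same ordering (leaves first); the abstract's point is that there are constructions where no single ordering is optimal for both.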

Similar resources

Efficient Sparse Cholesky Factorization on a Massively Parallel SIMD Computer

We investigate the effect of load balancing when performing Cholesky factorization on a massively parallel SIMD computer. In particular we describe a supernodal algorithm for performing sparse Cholesky factorization. The way the matrix is mapped onto the processors has significant effect on its efficiency. We show that this assignment problem can be modeled as a graph coloring problem in a weig...

Fast Sparse Matrix Factorization on Modern Workstations

The performance of workstation-class machines has experienced a dramatic increase in the recent past. Relatively inexpensive machines which offer 14 MIPS and 2 MFLOPS performance are now available, and machines with even higher performance are not far off. One important characteristic of these machines is that they rely on a small amount of high-speed cache memory for their high performance. In...

Improving Performance of Hypermatrix Cholesky Factorization

This paper shows how a sparse hypermatrix Cholesky factorization can be improved. This is accomplished by means of efficient codes which operate on very small dense matrices. Different matrix sizes or target platforms may require different codes to obtain good performance. We write a set of codes for each matrix operation using different loop orders and unroll factors. Then, for each matrix siz...

Implementing a parallel matrix factorization library on the cell broadband engine

Matrix factorization (or often called decomposition) is a frequently used kernel in a large number of applications ranging from linear solvers to data clustering and machine learning. The central contribution of this paper is a thorough performance study of four popular matrix factorization techniques, namely, LU, Cholesky, QR, and SVD on the STI Cell broadband engine. The paper explores algori...

A Scalable Parallel Algorithm for Sparse Matrix Factorization

In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze its performance and scalability, and present experimental results of its implementation on a 1024-processor nCUBE2 parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm improves the state of the art in parallel direct solution of sparse linear systems b...

Journal:
  • SIAM J. Matrix Analysis Applications

Volume 35  Issue

Pages  -

Publication date 2014